Tunable Locally-Optimal Geographical Forwarding 
in Wireless Sensor Networks with 
Sleep-Wake Cycling Nodes 

K.P. Naveen and Anurag Kumar 
Dept. of E.C.E., Indian Institute of Science, Bangalore 560 012, India, 
{naveenkp, anurag}@ece.iisc.ernet.in



Abstract 




We consider a wireless sensor network whose main function is to detect certain infrequent alarm events, and to forward alarm 
packets to a base station, using geographical forwarding. The nodes know their locations, and they sleep-wake cycle, waking 
up periodically but not synchronously. In this situation, when a node has a packet to forward to the sink, there is a trade-off 
between how long this node waits for a suitable neighbor to wake up and the progress the packet makes towards the sink once 
it is forwarded to this neighbor. Hence, in choosing a relay node, we consider the problem of minimizing average delay subject 
to a constraint on the average progress. By constraint relaxation, we formulate this next hop relay selection problem as a Markov 
decision process (MDP). The exact optimal solution (BF (Best Forward)) can be found, but is computationally intensive. Next, 
we consider a mathematically simplified model for which the optimal policy (SF (Simplified Forward)) turns out to be a simple
one-step-look-ahead rule. Simulations show that SF is very close in performance to BF, even for reasonably small node density.
We then study the end-to-end performance of SF in comparison with two extremal policies: Max Forward (MF) and First Forward
(FF), and an end-to-end delay minimising policy proposed by Kim et al. [1]. We find that, with appropriate choice of one hop
average progress constraint, SF can be tuned to provide a favorable trade-off between end-to-end packet delay and the number of
hops in the forwarding path.

I. Introduction 


An important application of wireless sensor networks (WSN) is dense embedded sensing for the purpose of detecting certain
infrequently occurring events, such as failures in a large structure, or intrusion into a secure region. Such an event can occur
anywhere in a large WSN, and once an event is detected, the alarm needs to be rapidly sent to the sink for further action.
In such WSNs, typically the nodes rely on batteries, or energy harvested from their surroundings, and, hence, need to be
extremely parsimonious in their use of energy. In order to conserve energy, the nodes operate in sleep-wake cycles; when a node
wakes up it performs sensing, and also can assist in forwarding any alarm packets towards the sink. In this paper, we consider
the situation in which the sleep-wake cycles of nodes are not synchronized. In such a setting, stateful routing is not possible.
Instead, if the nodes know their own locations and that of the sink, then it is possible to dynamically select forwarding nodes
that are successively nearer to the sink. This is called geographical routing, and has been widely studied as a simple scalable
approach for routing in sensor networks [2], [3], [4], [5]. For the purpose of location determination, low cost GPS devices
are now becoming available, and can be incorporated in the nodes; alternatively, approximate localization algorithms based on
various geometrical principles can also be used (see, for example, [6], [7]). For a survey on routing and localization, see [8],
[9]. In this paper we assume that nodes know their exact locations and also the location of the sink.

The relay node selection problem: In geographical forwarding, in our setting, there arises the problem of optimal relay node 
selection, which we now discuss. One approach is that of greedy forwarding, in which an intermediate node forwards the 
packet to its neighbor node that makes maximum progress towards the sink. This scheme is referred to as Most Forward within 
Radius (MFR) ([2], [3]). If the node density is large such that every node has a neighbor that is closer to the sink than itself, 
then the greedy approach can find routes close to the minimum hop paths. Following a minimum hop path is beneficial since 
it reduces the number of times the network needs to transmit the packet. 

However, when the nodes are sleep-wake cycling in an asynchronous manner there is a trade-off between the delay in relay 
node selection and the progress made towards the sink. For example, if MFR is implemented, then for an intermediate node 
to forward the packet to a relay node that makes the maximum progress towards the sink, the intermediate node will need to 
wait for all its neighbors closer to the sink than itself to wake up. This will result in an increase in the delay of the alarm that 
is being forwarded. In fact, a counterpart to the MFR policy could be the policy that forwards the packet to the first node that 
wakes up and is nearer to the sink than the intermediate node. In this paper we call this latter policy First Forward (FF), and 
the MFR policy, simply, Max Forward (MF). 

In this paper we study the above trade-off for the following one hop relaying problem. A node needs to forward a packet to 
the sink. There is a set of neighbors of the node that are nearer to the sink than the node; the forwarding set. The nodes are 
asynchronously sleep-wake cycling according to a certain model. We seek policies for relay node selection so as to minimize 
the delay in determining the relay node, subject to a constraint on the progress made towards the sink. We assume that each
node has at least one neighbor that is strictly closer to the sink than itself so that greedy forwarding will always find a path
to the sink. This is a reasonable assumption for large node densities.

Our contributions:

• The problem of minimizing average one hop delay subject to a constraint on the average progress made, when nodes 
wake up periodically, but not synchronously, is formulated as a Markov decision problem (MDP), and solved to yield the 
optimal policy which we call Best Forward (BF). See Sections IV and V.

• In a mathematically simplified setting (i.i.d., exponentially distributed inter-wakeup times) the MDP approach is used to 
derive a threshold type policy, called Simplified Forward (SF). The threshold is a function of the constraint on progress, 
and the policy is to transmit to the first node which wakes up and makes a progress of more than the threshold. See
Section VI. While such a policy has been proposed heuristically in previous works ([10], [11]), we have derived it from
the MDP formulation and we show through simulations that the performance of this policy is close to that of BF. The
simulation results are in Section VIII.

• Finally, we compare the end to end performance (average delay and hop counts) of the SF policy with the forwarding 
policy proposed by Kim et al. [1]. The approach of Kim et al. aims to achieve minimum average end-to-end delay, but at 
the expense of an initial configuration phase. The SF policy, however, does not need any global organization phase, and 
the progress constraint can be used to tune the end-to-end performance to suitably trade-off between end-to-end delay 
and the number of hops in the forwarding path. These results are reported in Section VIII.

II. Related Work 

Zorzi and Rao ([12]) consider a scenario similar to ours: geographical forwarding in a wireless mesh network in which 
the nodes know their locations, and are sleep-wake cycling. They propose GeRaF (Geographical Random Forwarding), a 
distributed relaying algorithm, whose objective is to carry a packet to its destination in as few hops as possible, by making 
as large progress as possible at each relaying stage. Thus, the objective is similar to the MFR algorithm, mentioned above 
([2], [3]). For their algorithm, the authors obtain the average number of hops (for given source-sink distance) as a function of 
the node density. These authors do not consider the trade-off between relay selection delay and the progress towards the sink, 
which is a major contribution of our work. 

Liu et al. ([11]) propose a relay selection approach as a part of CMAC, a protocol for geographical packet forwarding. 
With respect to the fixed sink, a node $i$ has a forwarding set consisting of all nodes that make progress greater than $r_0$ (an
algorithm parameter). If $Y$ represents the delay until the first wake-up instant of a node in the forwarding set, and $X$ is the
corresponding progress made, then, under CMAC, node $i$ chooses an $r_0$ that minimizes the expected normalized latency $E[Y/X]$.
The Random Asynchronous Wakeup (RAW) protocol ([10]) also considers transmitting to the first node to wake up that makes 
a progress greater than a threshold Th. Interestingly, this is also the structure of the optimal policy provided by one of our 
Markov decision process formulations. 

Kim et al. ([1]) consider a dense WSN in which the traffic model and sleep-wake cycling are similar to ours. An occasional 
alarm packet needs to be sent, from wherever in the network it is generated, to the sink. The nodes are asynchronously sleep- 
wake cycling. The authors develop an optimal anycast scheme to minimize average end-to-end delay from any node i to the 
sink. The optimization is also done over sleep-wake cycling patterns and rates. A dynamic programming approach is taken, 
with the stages being the number of hops to the sink. While the framework is similar to ours, Kim et al. do not consider the 
objective of spatial progress at each hop, which results in the reduction of hop counts along the forwarding paths, and thus 
in the reduction of node energy utilization. In our work, we have studied the trade-off, at a typical forwarding stage, between 
forwarding delay and the distance that the packet covers in the hop. 

Rossi et al. ([13]) consider the problem of geographical forwarding in a wireless sensor network in which each node knows 
its hop distance from the sink. For each link, there is a link cost (for example, energy cost) for forwarding a packet over that 
link. Thus, there are two end-to-end cost criteria for a forwarding path: the total link cost of the path, and the number of hops 
in the path. When a node, say i, has a packet to forward to the sink, it has to consider the trade-off between cost reduction and 
hop distance reduction; note that cost can be reduced by forwarding the packet to a neighbor node with the same hop distance 
to the sink, but using which the total link cost could be lower. The information available at i is the cost to all its neighbors, 
and the statistics of the costs-to-go from the neighbors. The major difference in our work is that we have a sequential decision 
problem at each stage, since the costs (wake-up delay) and rewards (progress towards the sink) are revealed as the nodes wake 
up, and only the statistics are known a priori. 

Chaporkar and Proutiere ([14]) consider the problem of a transmitter that needs to transmit over one of several available 
channels. The transmitter can probe the channels to determine channel state information in order to encode its transmissions. 
The trade-off is between the time taken to probe and the throughput advantage of finding a good channel. Some important 
differences between their model and ours are the following. In our work the trade-off is between the time taken to wait for 
a relay to wake up, and the spatial progress the relay makes towards the sink. In [14], the transmitter can use an unprobed 
channel, whereas in our problem a relay that has not yet woken up cannot be used. In [14], the transmitter can probe the 
channels in an order that it can choose (e.g., the stochastically best channel first); in our problem the relays wake up in a
random order that is not under the control of the transmitter. In [14] it is shown that if the use of an unprobed channel is not
allowed then a one-step-look-ahead rule is optimal. This is similar to the solution we obtain for a simplified version of our 
model. Note that whereas the concern in [14] is only with one-step relaying, we also study how the one-step policy performs 
in terms of end-to-end objectives, namely, path delay and path hop count. 

III. System Model 

A. Node Deployment 

$N$ identical sensor nodes are uniformly deployed in the square region $[0, L]^2$. We take $N$ to be a Poisson random variable of
rate $\lambda L^2$, where $\lambda$ is the node density. Let $x_i$, $i = 1, 2, \ldots, N$, be the locations of the nodes. Additional source and sink nodes
are placed at fixed locations $x_0 = (0, 0)$ and $x_{N+1} = (L, L)$ respectively. Thus, including the source and sink nodes, there are
a total of $N + 2$ nodes in the region. $r_c$ is the communication range of each node. Two nodes $i$ and $j$ are called neighbors if
and only if $|x_i - x_j| < r_c$. The distance between node $i$ and the sink (node $N + 1$) is $L_i = |x_{N+1} - x_i|$.

B. The Sleep-Wake Process 

To conserve energy, each node performs periodic sleep-wake cycling. The sleep-wake times of the nodes are not synchronized. 
Since we are interested in studying the delay incurred in routing due to sleep-wake cycling alone, we neglect the transmission 
delay, propagation delay and other overhead delays. This means that if node i has a packet to transmit to its neighboring node 
j, then i can transmit immediately at the instant j wakes up. We model this by taking the time for which a node stays awake 
to be zero. 

More formally, let $T_i$, $i = 1, 2, \ldots, N + 1$, be i.i.d. random variables which are uniform on $[0, T]$, where $T$ is the period of
the sleep-wake cycle. Then node $i$ wakes up at the periodic instants $kT + T_i$, $k \geq 0$. We define the waiting time for $i$ to wake
up at time $t$ as,

$$W_i(t) = \inf\{kT + T_i \geq t : k \geq 0\} - t \tag{1}$$
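The waiting time in (1) can be evaluated directly. The following is a minimal sketch, assuming a node's wake-up offset is passed as `offset` and the cycle length as `period`; the function and parameter names are illustrative and do not come from the paper.

```python
import math

def waiting_time(t: float, offset: float, period: float = 1.0) -> float:
    """Time from t until the next wake-up instant k*period + offset, k >= 0."""
    if t <= offset:
        # The first wake-up (k = 0) has not happened yet.
        return offset - t
    # Smallest integer k with k*period + offset >= t.
    k = math.ceil((t - offset) / period)
    return k * period + offset - t
```

For example, with `offset = 0.5` and `period = 1.0`, a packet arriving at `t = 0.75` waits until the wake-up at 1.5.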

C. Forwarding Rules and Assumptions 

Forwarding rules dictate the actions a node can take when it has to transmit. We are interested in decentralized policies 
where a node can take decisions only by observing the activities in its neighborhood (i.e., the disk of radius $r_c$ centered around
the node of interest). In this regard we impose some restrictions on the network. 

Traffic Model: There is a single packet in the network which is to be routed from the source to sink. At time 0, the packet 
is given to the source and the routing process begins. The nodes which get the packet for forwarding are called relay nodes. 
The packet traverses a sequence of relay nodes to eventually reach the sink, at which time the routing ends. Thus there is a 
single flow and further the flow consists of only one packet. This set up is reasonable, because in sensor networks we can 
assume that the events are sufficiently separated in time and/or location so that the flows due to two events do not intersect. 
To avoid multiple packet transmission by different nodes detecting the same event, the nodes can resolve among themselves to 
select one node (say the one closest to the sink), which can then transmit. Further, the information about an event comprises 
its location, and possibly target classification data, which along with some control bits can be easily incorporated in a single 
packet. This justifies the idea to study the performance of a single packet alone. 

Forwarding Set: Each node knows its location and the location of the sink. The forwarding set of a node is the set of its 
neighbors that are closer to the sink than itself. A relay node considers forwarding the packet only to a node in its forwarding
set. Each node knows the number of neighbors in its forwarding set, but is not aware of their locations and wake times. While 
in this paper we assume that each node knows the number of nodes in its forwarding set, it would be desirable to develop 
forwarding algorithms that do not require even this knowledge. We leave this as future work, but in Section VIII-B we provide
simulation results on the performance of our algorithm when the node takes the number of nodes in its forwarding region to 
be just the expected number of nodes. 

D. Some Notation 

To define a forwarding policy more formally, we begin by setting up some notation. Consider a generic node i which gets 
the packet to forward at some instant $t$. Let $S_i = \{y : |y - x_i| < r_c, |x_{N+1} - y| < L_i\}$. $S_i$ is the set of all points that are
within the communication radius of $i$ and are strictly closer to the sink than $i$ (see Fig. 1); we ignore edge effects by assuming
that $S_i \subset [0, L]^2$. If $x_j \in S_i$ then the progress made by $j$ is $Z_j = L_i - L_j$. Let $N_i$ be the number of nodes in $S_i$. Note that
$N_i \sim \text{Poisson}(\lambda |S_i|)$, where $|S_i|$ is the area of the region $S_i$. Recall that node $i$ knows $N_i$ and hence we focus on the event
$\{N_i = K\}$ for some $K > 0$.

Let the indices of the nodes in $S_i$ be arranged as $i_1, \ldots, i_K$, such that $W_{i_1}(t) \leq W_{i_2}(t) \leq \cdots \leq W_{i_K}(t)$. The corresponding
values of progress are $Z_{i_1}, Z_{i_2}, \ldots, Z_{i_K}$. For simplicity, from here on we neglect $i$ in the subscript and simply use
$W_1(t), \ldots, W_K(t)$ and $Z_1, Z_2, \ldots, Z_K$.



Fig. 1. $x_i$ and $x_{N+1}$ are the locations of node $i$ and the sink respectively. $L_i$ is the distance between them, $r_c$ is the communication radius. $S_i$ is the set of
all points that are within the communication radius of node $i$ and closer to the sink than $i$. $S_i$ is the shaded region in the figure.



The locations of each of these $K$ nodes are uniformly distributed in the region $S_i$, independently of the others. Hence the
progress values made by them are i.i.d., with the same distribution as a generic random variable $Z$. The p.d.f. of $Z$ is supported on $[0, r_c]$ and is given by,

$f_Z(z)$

where $|S_i|$ denotes the area of the region $S_i$.

Let $U_1 = W_1(t)$ and $U_k = W_k(t) - W_{k-1}(t)$ for $2 \leq k \leq K$. We refer to $\{U_k\}$ as the inter-wakeup times. These are the
waiting times between the wakeup instants of successive nodes in $S_i$ (see Fig. 2). Further, $U_k$ and $Z_k$ are independent.

Fig. 2. $(W_k(t), Z_k)$ represents the wake instant and the progress, respectively, of the $k$-th node in $S_i$. These are shown as points in $[0, T] \times [0, r_c]$.
$U_k$ is the inter-wakeup time between nodes $i_k$ and $i_{k-1}$.
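The progress distribution described above can be explored numerically. The sketch below is not taken from the paper: it places node $i$ at the origin and the sink at $(L_i, 0)$, draws uniform points in the forwarding region by rejection sampling, and compares them with a density obtained from a simple arc-length argument (the density of $Z$ at $z$ is proportional to the length of the arc of radius $L_i - z$ around the sink that lies inside the communication disk). That density is a derivation made for this sketch under the assumption $L_i > r_c$; it may differ in form from the paper's own expression. All function names are hypothetical.

```python
import numpy as np

def arc_length(z, L_i, r_c=1.0):
    """Length of the arc of radius L_i - z around the sink inside node i's disk."""
    rho = L_i - np.asarray(z, dtype=float)
    cos_half = (rho**2 + L_i**2 - r_c**2) / (2.0 * rho * L_i)
    return 2.0 * rho * np.arccos(np.clip(cos_half, -1.0, 1.0))

def progress_pdf(z, L_i, r_c=1.0, grid=20_000):
    """Density of the progress Z on [0, r_c]; |S_i| is found by a midpoint rule."""
    mid = (np.arange(grid) + 0.5) * (r_c / grid)
    area = arc_length(mid, L_i, r_c).sum() * (r_c / grid)
    return arc_length(z, L_i, r_c) / area

def sample_progress(L_i, r_c=1.0, n=20_000, seed=0):
    """Draw n progress values by rejection sampling uniform points in S_i."""
    rng = np.random.default_rng(seed)
    samples = np.empty(0)
    while samples.size < n:
        r = r_c * np.sqrt(rng.random(4 * n))       # uniform in the disk around node i
        theta = 2.0 * np.pi * rng.random(4 * n)
        x, y = r * np.cos(theta), r * np.sin(theta)
        progress = L_i - np.hypot(L_i - x, y)      # sink placed at (L_i, 0)
        samples = np.concatenate([samples, progress[progress > 0.0]])
    return samples[:n]
```

A histogram of `sample_progress(5.0)` can be overlaid on `progress_pdf` evaluated on a grid in $[0, 1]$ as a quick consistency check.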



The waiting times $W_1(t), W_2(t), \ldots, W_K(t)$ are the order statistics of $K$ i.i.d. random variables that are uniform on $[0, T]$.
The p.d.f. of the $k$-th order statistic is [15, Chapter 2],

$$f_{W_k}(u) = \frac{K!\, u^{k-1} (T - u)^{K-k}}{(k-1)!\,(K-k)!\,T^K} \tag{4}$$

for $0 \leq u \leq T$. Also the joint p.d.f. of the $k$-th and $l$-th order statistics (for $k < l$) is [15, Chapter 2],

$$f_{W_k, W_l}(u, v) = \frac{K!\, u^{k-1} (v - u)^{l-k-1} (T - v)^{K-l}}{(k-1)!\,(l-k-1)!\,(K-l)!\,T^K} \tag{5}$$

for $0 \leq u \leq v \leq T$. Later we will be interested in the conditional p.d.f. $f_{U_{k+1}|W_k}$ for $1 \leq k \leq K - 1$. Using the above
equations we can write,

$$f_{U_{k+1}|W_k}(u|w) = \frac{f_{W_k, W_{k+1}}(w, w + u)}{f_{W_k}(w)} = \frac{(K-k)\,(T - w - u)^{K-k-1}}{(T - w)^{K-k}} \tag{6}$$

for $0 \leq w < T$ and $0 \leq u \leq T - w$.
E. Single Hop Policy 

The decision process begins at the instant $t$ at which node $i$ gets the packet to forward. This is stage $k = 0$. The $k$-th ($k \geq 1$)
decision instant is the time at which node $i_k$ wakes up.

A Single Hop (SH) policy $\pi$ is a sequence of mappings $\{\mu_k^\pi : 0 \leq k \leq K\}$, where $\mu_0^\pi : \{(0, 0)\} \to \{0\}$ and, for $k \geq 1$,
$\mu_k^\pi : [0, T] \times [0, r_c] \to \{0, 1\}$. $\pi$ should also satisfy $\mu_K^\pi(w, b) = 1$. The function $\mu_k^\pi$ maps the state at stage $k$ to an action
0 (continue) or 1 (stop). Let $D^\pi(t)$ and $Z^\pi(t)$ denote the delay incurred and progress made by node $i$ using policy $\pi$. The forwarding
rules for node $i$, using policy $\pi$, are as follows:

• At stage 0, node $i$ has to wait for further nodes to wake up. We represent this by allowing the only state at stage 0 to be
$\mathbf{0} = (0, 0)$ and the corresponding action to be 0 (continue to wait), i.e., $\mu_0^\pi(\mathbf{0}) = 0$.

• If $L_i < r_c$, then wait for the sink to wake up and transmit to it. In this case, the delay and progress made are $D^\pi(t) = W_{N+1}(t)$
and $Z^\pi(t) = L_i$ respectively.

• Otherwise (i.e., if $L_i \geq r_c$), wait for the nodes in $S_i$ to wake up. When node $i_k$ wakes up ($1 \leq k \leq K$), evaluate
$p = \mu_k^\pi(W_k(t), b_k)$ where $b_k = \max\{Z_1, \ldots, Z_k\}$. If $p = 1$, then transmit to the node $i_{\arg\max\{Z_1, \ldots, Z_k\}}$. The delay
incurred is $D^\pi(t) = W_k(t)$ and the progress made is $Z^\pi(t) = b_k$. If $p = 0$, ask the node which makes the most progress
so far to stay awake, put the other node to sleep and wait for further nodes to wake up.

The requirement $\mu_K^\pi(w, b) = 1$ in the definition of $\pi$ ensures that node $i$ transmits at or before the instant the last node
wakes up.

Since the distribution of $\{(W_k(t), Z_k) : 1 \leq k \leq K\}$ does not depend on the value of $t$, the average values of $D^\pi(t)$ and
$Z^\pi(t)$ also do not depend on $t$. Hence, to compute these average values we can, without loss of generality, take $t = 0$ and use
$D^\pi$ and $Z^\pi$ to simplify the notation.

Let $\Pi$ represent the class of all SH policies. Note that many policies are excluded from class $\Pi$. For instance, the policy
which waits for all the nodes to wake up and then transmits to the one which makes the least progress does not belong to the
class $\Pi$. This is because, for a policy in $\Pi$, transmission is allowed only to the node that makes the most progress so far. We
would like to explicitly mention two SH policies, namely Max Forward (MF) and First Forward (FF):

A node using the Max Forward policy will wait for all the nodes in its forwarding set to wake up and then transmit to the one
which makes the most progress. We use $\pi_{MF}$ to represent this policy. For this policy, $\mu_k^{\pi_{MF}}(w, b) = 1$ if and only if $k = K$. This
policy obtains the maximum delay and maximum progress among all policies in class $\Pi$.

A node using the First Forward policy will always transmit to the node in the forwarding set which wakes up first, irrespective
of the progress made by it. $\pi_{FF}$ is used to represent this policy. For this policy, $\mu_1^{\pi_{FF}}(w, b) = 1$. $\pi_{FF}$ obtains the minimum delay
and minimum progress among all the policies in class $\Pi$.
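The single hop forwarding rules and the two extremal policies can be sketched in a few lines. This is an illustrative sketch only: the function names and the convention that a policy is a callable `policy(k, w, b, K)` returning `True` to stop are assumptions made here, not notation from the paper.

```python
import numpy as np

def simulate_hop(policy, wake_times, progress):
    """Run one hop and return (delay, progress) for the chosen relay.

    policy(k, w, b, K) -> True to stop at stage k (1-based), where w is the
    wake instant of the k-th node and b is the best progress seen so far.
    """
    order = np.argsort(wake_times)          # nodes in order of waking up
    K = len(order)
    best = 0.0
    for k, idx in enumerate(order, start=1):
        w = wake_times[idx]
        best = max(best, progress[idx])
        # Transmission is forced when the last node wakes up.
        if k == K or policy(k, w, best, K):
            return w, best
    raise ValueError("forwarding set is empty")

def max_forward(k, w, b, K):
    """MF: stop only after all K nodes have woken up."""
    return k == K

def first_forward(k, w, b, K):
    """FF: stop at the first node that wakes up."""
    return True
```

With wake instants `[0.7, 0.2, 0.5]` and progress values `[0.9, 0.1, 0.4]`, MF waits until 0.7 and achieves progress 0.9, whereas FF forwards at 0.2 with progress 0.1, which illustrates the two ends of the delay versus progress trade-off.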



IV. Problem Formulation

From here on, without loss of generality we fix $T = 1$ and $r_c = 1$. Let $P_K$ (where $K \geq 1$) denote the probability law
conditioned on the event $\{N_i = K\}$, i.e., $P_K(\cdot) = P(\cdot \mid N_i = K)$. Similarly we define the conditional expectation $E_K$. Define
$\gamma_{MF} = E_K[Z^{\pi_{MF}}]$ and $\gamma_{FF} = E_K[Z^{\pi_{FF}}]$, the average progress made by the MF and FF policies respectively.

Our interest in this work is, at a relay node $i$ with $N_i = K$, to minimize the average delay subject to a constraint on the
average progress achieved. More formally the problem is,

$$\min_{\pi \in \Pi} E_K[D^\pi] \quad \text{subject to} \quad E_K[Z^\pi] \geq \gamma \tag{7}$$

where $\gamma \in [0, \gamma_{MF}]$.

This formulation embodies the one-step tradeoff between the need to forward the packet quickly while attempting to make
substantial progress towards the sink. The parameter $\gamma$ controls the tradeoff. A large $\gamma$ indicates our desire to make large
progress in each step, which will come at a cost of a large one hop forwarding delay.

To solve the problem in (7), we consider the following unconstrained problem,

$$\min_{\pi \in \Pi} \left( E_K[D^\pi] - \eta\, E_K[Z^\pi] \right) \tag{8}$$

where $\eta > 0$. Let $\pi_{BF}(\eta)$ (Best Forward) be the optimal solution for this problem.

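Lemma 1: For a given $\gamma$ in problem (7), suppose there is an $\eta_\gamma$ such that $E_K[Z^{\pi_{BF}(\eta_\gamma)}] = \gamma$; then $\pi_{BF}(\eta_\gamma)$ is optimal for
the problem in (7) as well.

Proof: Since $\pi_{BF}(\eta_\gamma)$ is optimal for the problem in (8),

$$E_K[D^{\pi_{BF}(\eta_\gamma)}] - \eta_\gamma E_K[Z^{\pi_{BF}(\eta_\gamma)}] \leq E_K[D^\pi] - \eta_\gamma E_K[Z^\pi], \quad \text{for all } \pi \in \Pi,$$

i.e., using $E_K[Z^{\pi_{BF}(\eta_\gamma)}] = \gamma$,

$$E_K[D^{\pi_{BF}(\eta_\gamma)}] \leq E_K[D^\pi] - \eta_\gamma \left( E_K[Z^\pi] - \gamma \right).$$

Therefore, for any $\pi$ such that $E_K[Z^\pi] \geq \gamma$, we have

$$E_K[D^{\pi_{BF}(\eta_\gamma)}] \leq E_K[D^\pi].$$

■

In the subsequent sections we focus on solving the problem in (8).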



V. Optimal Policy for the Exact Model 

To solve the problem in (8), we develop it in a Markov Decision Process (MDP) framework [16]. $\mathcal{X} = [0, 1]^2 \cup \{\psi\}$ is the
state space (recall that $T = 1$ and $r_c = 1$). $\psi$ is the terminating state. $\mathcal{C} = \{0, 1\}$ is the control space, where 1 is for stop and
0 is for continue. A small change to the $\pi$ defined earlier in Section III-E is the inclusion of $\psi$ in the domain of $\mu_k^\pi$. Let
$(w_k, b_k)$ be the state at stage $k$, where $b_k$ is the best (maximum) progress made by the nodes waking up until stage $k$, i.e.,
$b_k = \max\{Z_1, \ldots, Z_k\}$. Conditioned on being in state $(w_k, b_k)$ at stage $k$, the transition to the next state depends on $w_k$ through
$U_{k+1}$, whose p.d.f. is $f_{U_{k+1}|W_k}(\cdot \mid w_k)$ (Equation (6)). The other disturbance component, $Z_{k+1}$, is independent of $(w_k, b_k)$. The
p.d.f. of $Z_{k+1}$ is $f_Z$ (Equation (2)). We define the conditional expectation,

$$E_{(W_k = w_k)}[\,\cdot\,] = E_K[\,\cdot \mid W_k = w_k]$$

Then using expression (6) we can write,

$$E_{(W_k = w_k)}[U_{k+1}] = \frac{1 - w_k}{K - k + 1} \tag{9}$$
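The conditional mean in (9) can be checked by simulation. The sketch below is an illustration rather than part of the paper: it sorts $K$ uniform wake instants (with $T = 1$), keeps the realizations in which $W_k$ falls in a narrow band around a chosen $w$, and averages $U_{k+1}$ over them. The function name, the band width and the sample size are arbitrary choices made here.

```python
import numpy as np

def empirical_conditional_mean(K, k, w, band=0.02, n=200_000, seed=1):
    """Estimate E[U_{k+1} | W_k close to w] for K uniform wake instants on [0, 1]."""
    rng = np.random.default_rng(seed)
    wake = np.sort(rng.random((n, K)), axis=1)   # each row holds W_1 <= ... <= W_K
    w_k = wake[:, k - 1]
    u_next = wake[:, k] - wake[:, k - 1]          # U_{k+1} = W_{k+1} - W_k
    mask = np.abs(w_k - w) < band                 # condition on W_k being near w
    return float(u_next[mask].mean())
```

For instance, with `K = 5`, `k = 2` and `w = 0.3`, the estimate should be close to $(1 - 0.3)/(5 - 2 + 1) = 0.175$.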

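The initial state is $s_0 = \mathbf{0}$ and the initial action is $a_0 = 0$ always. Therefore the next state is $s_1 = (U_1, Z_1)$ and the cost incurred at stage
0 is $g_0(\mathbf{0}, 0) = U_1$. If $a_k \in \mathcal{C}$ is the action taken at stage $1 \leq k \leq K - 1$, then the next state $s_{k+1}$ is,

$$s_{k+1} = \begin{cases} (w_k + U_{k+1}, \max\{Z_{k+1}, b_k\}) & \text{if } a_k = 0 \\ \psi & \text{if } a_k = 1 \end{cases}$$

and the one step cost function is,

$$g_k((w_k, b_k), a_k) = \begin{cases} U_{k+1} & \text{if } a_k = 0 \\ -\eta b_k & \text{if } a_k = 1 \end{cases} \tag{10}$$

If the state at stage $k$ is $\psi$ then $s_{k+1} = \psi$ and $g_k(\psi, a_k) = 0$ irrespective of $a_k$. Also, if $s_K$ is the state of the system at the
last stage, there is a cost of termination, $g_K(s_K)$, given as,

$$g_K(s_K) = \begin{cases} 0 & \text{if } s_K = \psi \\ -\eta b_K & \text{otherwise} \end{cases}$$

The total average cost incurred with policy $\pi$ is,

$$J_\pi(\mathbf{0}) = E_K\left[ \sum_{k=0}^{K-1} g_k(s_k, \mu_k^\pi(s_k)) + g_K(s_K) \right]$$

The expectation in the cost function above is taken over the joint distribution of $\{(U_k, Z_k) : 1 \leq k \leq K\}$. Note that,

$$J_\pi(\mathbf{0}) = E_K[D^\pi] - \eta\, E_K[Z^\pi]$$

Therefore the optimal cost is,

$$J^*(\mathbf{0}) = \min_{\pi \in \Pi} J_\pi(\mathbf{0}) = J_{\pi_{BF}(\eta)}(\mathbf{0})$$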

Let $J_k(w, b)$ be the optimal cost to go when the system is in state $(w, b)$ at stage $1 \leq k \leq K$. When the stage is $K$ (i.e.,
all the nodes have woken up), then invariably transmission has to happen. Therefore,

$$J_K(w, b) = -\eta b = -\eta \max\{b, \phi_K(w, b)\} \tag{11}$$

where we define $\phi_K(w, b) = 0$ for all $(w, b)$. Next, when there is one more node to wake up (i.e., the stage is $K - 1$), both
actions, $a_{K-1} = 1$ and $a_{K-1} = 0$, are possible. Therefore,

$$J_{K-1}(w, b) = \min\left\{ -\eta b,\; E_{(W_{K-1} = w)}\left[ U_K + J_K(w + U_K, \max\{b, Z_K\}) \right] \right\}$$

The terms in the min expression are the costs when $a_{K-1} = 1$ (stop) and $a_{K-1} = 0$ (continue) respectively. Using the
expression for $J_K$ in (11) we obtain,

$$\begin{aligned} J_{K-1}(w, b) &= \min\left\{ -\eta b,\; E_{(W_{K-1} = w)}\left[ U_K - \eta \max\{b, Z_K, \phi_K(w + U_K, \max\{b, Z_K\})\} \right] \right\} \\ &= -\eta \max\{b, \phi_{K-1}(w, b)\} \end{aligned} \tag{12}$$

where,

$$\phi_{K-1}(w, b) = E_{(W_{K-1} = w)}\left[ \max\{b, Z_K, \phi_K(w + U_K, \max\{b, Z_K\})\} - \frac{U_K}{\eta} \right] \tag{13}$$



The following lemma is obtained easily. 

Lemma 2: For every $1 \leq k \leq K - 1$, the following equations hold,

$$J_k(w, b) = -\eta \max\{b, \phi_k(w, b)\} \tag{14}$$

where,

$$\phi_k(w, b) = E_{(W_k = w)}\left[ \max\{b, Z_{k+1}, \phi_{k+1}(w + U_{k+1}, \max\{b, Z_{k+1}\})\} - \frac{U_{k+1}}{\eta} \right] \tag{15}$$

Proof: Suppose that for some $2 \leq k \leq K - 1$ equations (14) and (15) hold. Then, following lines similar to those used to
obtain (12) and (13) (just replace $K$ by $k$), we can show that (14) and (15) hold for $k - 1$ as well. Since we have already shown
that these equations hold for $k = K - 1$, by induction we can conclude that they hold for every $1 \leq k \leq K - 1$. ■
The structure of the optimal policy is given in the following corollary. 
Corollary 3: The optimal policy $\pi_{BF}(\eta)$ is of the following form,

$$\mu_k^{\pi_{BF}(\eta)}(w, b) = \begin{cases} 1 & \text{if } b \geq \phi_k(w, b) \\ 0 & \text{otherwise} \end{cases}$$

for $1 \leq k \leq K$, where $\phi_K(w, b) = 0$ for all $(w, b)$ and, for $1 \leq k \leq K - 1$, $\phi_k(w, b)$ is given in equation (15). ■

Remarks: The optimal policy requires the threshold functions $\{\phi_k\}$, which are computationally intensive to evaluate. For our later numerical
work in Section VIII we discretize the state space into $10^4$ equally spaced points and use the approximate values of the
functions $\phi_k$, $1 \leq k \leq K - 1$, at these discrete points.



VI. Optimal Policy for a Simplified Model 

The random variables $\{U_k : 1 \leq k \leq K\}$ are identically distributed [15, Chapter 2] (but not independent). Their common
c.d.f. is $F_{U_k}(u) = 1 - (1 - u)^K$. From Fig. 3 we observe that the c.d.f. of $\{U_k : 1 \leq k \leq K\}$ is close to the c.d.f. of
an exponential random variable of parameter $K$, and the approximation becomes better for large values of $K$. This motivates
us to consider a simplified model where $\{U_k : 1 \leq k \leq K\}$ are distributed as Exponential($K$). Further, in our simplified model
we take these random variables to be independent.
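The quality of the exponential approximation can be quantified with a short calculation. The sketch below, which uses a hypothetical helper name, evaluates the largest gap on $[0, 1]$ between the exact c.d.f. $1 - (1 - u)^K$ and the Exponential($K$) c.d.f. $1 - e^{-Ku}$ on a uniform grid.

```python
import numpy as np

def cdf_gap(K, num_points=1001):
    """Largest absolute gap on [0, 1] between the exact and exponential c.d.f.s."""
    u = np.linspace(0.0, 1.0, num_points)
    exact = 1.0 - (1.0 - u) ** K          # c.d.f. of an inter-wakeup time U_k
    approx = 1.0 - np.exp(-K * u)         # c.d.f. of Exponential(K)
    return float(np.max(np.abs(exact - approx)))
```

Comparing `cdf_gap(5)` with `cdf_gap(15)` shows the gap shrinking as $K$ grows, in line with the observation that the approximation improves for larger $K$.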




Fig. 3. The c.d.f.s $F_{U_k}$ and $F_Y$, where $Y \sim$ Exponential($K$), plotted against $u$ for (a) $K = 5$ and (b) $K = 15$.

For the simplified model, the cost function (similar to (14)) when the system is in state $(w, b)$ at stage $1 \leq k \leq K - 1$ is,

$$J_k(w, b) = -\eta \max\{b, \phi_k(w, b)\} \tag{16}$$

where,

$$\phi_k(w, b) = E_K\left[ \max\{b, Z_{k+1}, \phi_{k+1}(w + U_{k+1}, \max\{b, Z_{k+1}\})\} - \frac{U_{k+1}}{\eta} \right] \tag{17}$$

We observe that, due to the i.i.d. inter-wake time assumption, the cost function does not depend on the value of $W_k$. Also,
we need not condition on $W_k = w$, unlike in the previous section, since the p.d.f. of $U_{k+1}$ does not depend on
$W_k$. Hence, the optimal policy for this model is going to be independent of $W_k$ for each $k$. So we simplify the state space
by ignoring the values of $W_k$ for each $k$, i.e., the state space is $\mathcal{X} = [0, 1] \cup \{\psi\}$. The control space $\mathcal{C}$ and the other disturbance
component $Z_k$ remain the same. Since the state space is different, we make a small change to the definition of policy $\pi$ by
allowing $\mu_k^\pi : \mathcal{X} \to \mathcal{C}$. The state transition and cost functions remain the same as in the previous section with $(w, b)$ replaced by
$b$. Let $\pi_{SF}(\eta)$ represent the optimal policy for this model.

Let $J_k(b)$ be the optimal cost to go at stage $k$ when the state is $b$. Then, for all $b \in [0, 1]$,

$$J_K(b) = -\eta b \tag{18}$$

Next, when the stage is $K - 1$, for $b \in [0, 1]$,

$$\begin{aligned} J_{K-1}(b) &= \min\{-\eta b,\; E_K[U_K + J_K(\max\{b, Z_K\})]\} \\ &= \min\{-\eta b,\; E_K[U_K - \eta \max\{b, Z_K\}]\} \\ &= -\eta \max\{b, \beta_1(b)\} \end{aligned} \tag{19}$$

where $\beta_1$ is a function which, for $b \in [0, 1]$, is given by,

$$\beta_1(b) = E_K[\max\{b, Z_K\}] - \frac{E_K[U_K]}{\eta} = E_K[\max\{b, Z\}] - \frac{1}{\eta K} \tag{20}$$

Here we have made use of the fact that $E_K[U_K] = \frac{1}{K}$ and $Z_K \sim Z$. The p.d.f. of $Z$ is given in (2). Evidently, at stage $K - 1$,
the optimal action is to stop and transmit the packet if $b \geq \beta_1(b)$ and to continue otherwise. The following results about $\beta_1(b)$
can easily be obtained; we provide the proof in the Appendix.

Lemma 4: 1) $\beta_1$ is continuous, increasing and convex in $b$.

2) If $\beta_1(0) < 0$, then $\beta_1(b) < b$ for all $b \in [0, 1]$.

3) If $\beta_1(0) \geq 0$, then there is a unique $\alpha_\eta$ such that $\beta_1(\alpha_\eta) = \alpha_\eta$.

4) If $\beta_1(0) \geq 0$, then $\beta_1(b) < b$ for $b \in (\alpha_\eta, 1]$ and $\beta_1(b) > b$ for $b \in [0, \alpha_\eta)$.

■

If $\beta_1(0) < 0$, then define $\alpha_\eta = 0$. Otherwise $\alpha_\eta$ is defined by $\beta_1(\alpha_\eta) = \alpha_\eta$. Then

$$\mu_{K-1}^{\pi_{SF}(\eta)}(b) = \begin{cases} 1 & \text{if } b \geq \alpha_\eta \\ 0 & \text{otherwise} \end{cases}$$

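We proceed to evaluate $J_{K-2}$.

$$\begin{aligned} J_{K-2}(b) &= \min\{-\eta b,\; E_K[U_{K-1} + J_{K-1}(\max\{b, Z_{K-1}\})]\} \\ &= \min\{-\eta b,\; E_K[U_{K-1} - \eta \max\{b, Z_{K-1}, \beta_1(\max\{b, Z_{K-1}\})\}]\} \\ &= -\eta \max\{b, \beta_2(b)\} \end{aligned} \tag{21}$$

where,

$$\beta_2(b) = E_K[\max\{b, Z, \beta_1(\max\{b, Z\})\}] - \frac{1}{\eta K} \tag{22}$$

Lemma 5: $\beta_2(b) \geq \beta_1(b)$ for any $b \in [0, 1]$. In particular, if $b \geq \alpha_\eta$ then $\beta_2(b) = \beta_1(b)$.

Proof: The first part follows easily because $E_K[\max\{b, Z\}] \leq E_K[\max\{b, Z, \beta_1(\max\{b, Z\})\}]$. Next, if $b \geq \alpha_\eta$ then,
from Lemma 4, $\max\{b, Z\} \geq \beta_1(\max\{b, Z\})$, so that $\max\{b, Z, \beta_1(\max\{b, Z\})\} = \max\{b, Z\}$. Therefore,

$$\beta_2(b) = E_K[\max\{b, Z\}] - \frac{1}{\eta K} = \beta_1(b)$$

■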

Lemma 6: For every $1 \leq k \leq K - 2$ the following holds,

$$J_k(b) = -\eta \max\{b, \beta_{K-k}(b)\} \tag{23}$$

where,

$$\beta_{K-k}(b) = E_K[\max\{b, Z, \beta_{K-(k+1)}(\max\{b, Z\})\}] - \frac{1}{\eta K}$$

and has the property $\beta_{K-k}(b) \geq \beta_{K-(k+1)}(b)$ for any $b \in [0, 1]$. In particular, if $b \geq \alpha_\eta$ then $\beta_{K-k}(b) = \beta_1(b)$.

Proof: The proof is along the lines used to obtain Equations (21), (22) and Lemma 5. ■

Corollary 7: The policy $\pi_{SF}(\eta)$ is of the following form: $\mu_K^{\pi_{SF}(\eta)}(b) = 1$ and

$$\mu_k^{\pi_{SF}(\eta)}(b) = \begin{cases} 1 & \text{if } b \geq \alpha_\eta \\ 0 & \text{otherwise} \end{cases} \tag{24}$$

for $1 \leq k \leq K - 1$. ■

Remarks: The policy is a simple one-step-look-ahead rule where at each $k$ ($1 \le k \le K - 1$) the policy compares the cost of
stopping at $k$ ($C_s = -\eta b$) with the cost of continuing for one more step and then stopping at $k+1$ ($C_c = \frac{1}{K} - \eta \mathbb{E}_K[\max\{b, Z\}]$).
The policy is to stop if $C_s \le C_c$ (which simplifies to: stop if $b \ge \alpha_\eta$), and to continue otherwise. In other words, the policy transmits to the
first node which makes a progress of more than $\alpha_\eta$. If all the nodes make a progress of less than $\alpha_\eta$, then it transmits to the node
whose progress is maximum at the instant the last node wakes up.
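
The threshold rule described in the remarks can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names, the bisection tolerance, and the representation of the distribution of $Z$ as a list of samples are assumptions made here. The example uses a uniform grid on $[0, 1]$ for $Z$ purely because the fixed point of $b = \beta_1(b)$ then has the closed form $1 - \sqrt{2/(\eta K)}$, which gives a convenient sanity check.

```python
import math

def beta1(b, z_samples, eta, K):
    """Estimate beta_1(b) = E[max(b, Z)] - 1/(eta*K) from samples of Z."""
    mean_max = sum(max(b, z) for z in z_samples) / len(z_samples)
    return mean_max - 1.0 / (eta * K)

def threshold(z_samples, eta, K, tol=1e-9):
    """Return alpha_eta: 0 if beta_1(0) < 0, else the root of b = beta_1(b)."""
    if beta1(0.0, z_samples, eta, K) < 0:
        return 0.0
    lo, hi = 0.0, 1.0  # g(b) = b - beta_1(b) satisfies g(0) <= 0 < g(1)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid - beta1(mid, z_samples, eta, K) <= 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def sf_action(b, alpha):
    """One-step-look-ahead rule: 1 = stop and transmit, 0 = continue."""
    return 1 if b >= alpha else 0

# Hypothetical example: Z uniform on [0, 1], approximated by a midpoint grid.
n = 2000
z_grid = [(i + 0.5) / n for i in range(n)]
alpha = threshold(z_grid, eta=2.0, K=5)
print(alpha, 1 - math.sqrt(2 / (2.0 * 5)))  # numerical vs. closed-form threshold
```

Bisection is used here because $g(b) = b - \beta_1(b)$ is continuous and changes sign on $[0, 1]$ whenever $\beta_1(0) \ge 0$, which is exactly the case in which Lemma 4 guarantees a unique fixed point.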



VII. Analytical Results 

In this section we apply the policy $\pi_{SF}(\eta)$ obtained from the simplified model to the actual model and obtain expressions
for the average progress and average delay incurred by node $i$. First we need some more notation. We abuse the notation $\mathcal{S}_i$ by
allowing $\mathcal{S}_i(z) = \{y : \|y - x_i\| \le r_c, \|x_{N+1} - y\| \le L_i - z\}$. $\mathcal{S}_i(z)$ is the set of points that are closer to the sink than $x_i$ by
at least $z \in [0, 1]$ (see Fig. 4). When $z = 0$, we simply use $\mathcal{S}_i$ instead of $\mathcal{S}_i(0)$. Let $p_z = \frac{|\mathcal{S}_i(z)|}{|\mathcal{S}_i|}$, where $|\mathcal{S}_i(z)|$ denotes the
area of the region $\mathcal{S}_i(z)$. $p_z$ is the conditional probability that a node falls in the region $\mathcal{S}_i(z)$, conditioned on the event that
the node belongs to $\mathcal{S}_i$.




Fig. 4. $x_i$ and $x_{N+1}$ are the locations of node $i$ and the sink respectively. $L_i$ is the distance between them, and $r_c$ is the communication radius. $\mathcal{S}_i(z)$ is the set
of all points that are within the communication radius of node $i$ and are closer to the sink than $i$ by at least $z$. $\mathcal{S}_i(z)$ is the shaded region in the figure.



A. Average Values for $\pi_{SF}(\eta)$

When using policy $\pi_{SF}(\eta)$, node $i$ transmits to the first node which makes a progress of more than $\alpha_\eta$. If there are $k \ge 1$
nodes in the region $\mathcal{S}_i(\alpha_\eta)$, since the wake times of these nodes are uniform on $[0, 1]$ and independent of each other, the average
time until the first one wakes up is $\frac{1}{k+1}$. If the region $\mathcal{S}_i(\alpha_\eta)$ is empty, then node $i$ will wait for all the nodes in $\mathcal{S}_i$ to wake
up and then transmit to the one which makes the maximum progress. In this case the average delay is $\frac{K}{K+1}$. Therefore,

$$\mathbb{E}_K[D^{\pi_{SF}(\eta)}] = \sum_{k=1}^{K} \binom{K}{k} p_{\alpha_\eta}^k (1 - p_{\alpha_\eta})^{K-k} \frac{1}{k+1} + (1 - p_{\alpha_\eta})^K \frac{K}{K+1} \tag{25}$$
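
The average delay expression (25) is straightforward to evaluate numerically. The following Python sketch does so for a given value of $p_{\alpha_\eta}$; the function name `sf_mean_delay` and its arguments are hypothetical, introduced only for illustration. Setting the probability to 1 or 0 recovers the First Forward and Max Forward delays, $\frac{1}{K+1}$ and $\frac{K}{K+1}$, respectively.

```python
from math import comb

def sf_mean_delay(p_alpha, K):
    """Evaluate Eq. (25): average one hop delay of the threshold policy.

    p_alpha is the probability that a node of the forwarding set lies in
    S_i(alpha_eta); K is the number of nodes in the forwarding set.
    """
    # k >= 1 nodes exceed the threshold: wait for the first of them, 1/(k+1).
    total = sum(
        comb(K, k) * p_alpha**k * (1 - p_alpha)**(K - k) / (k + 1)
        for k in range(1, K + 1)
    )
    # No node exceeds the threshold: wait for all K nodes, K/(K+1).
    return total + (1 - p_alpha)**K * K / (K + 1)

for p in (0.0, 0.25, 0.5, 1.0):
    print(p, sf_mean_delay(p, K=5))
```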

The expression for average progress can be written as,

$$\mathbb{E}_K[Z^{\pi_{SF}(\eta)}] = \int_0^1 \mathbb{P}_K(Z^{\pi_{SF}(\eta)} > z)\,dz \tag{26}$$

When $z \in [0, \alpha_\eta]$, the event $(Z^{\pi_{SF}(\eta)} > z, N_i = K)$ is the same as the event that at least one node makes a progress of more
than $z$. Therefore for $z \in [0, \alpha_\eta]$,

$$\mathbb{P}_K(Z^{\pi_{SF}(\eta)} > z) = 1 - (1 - p_z)^K \tag{27}$$

When $z \in (\alpha_\eta, 1]$, the event $(Z^{\pi_{SF}(\eta)} > z, N_i = K)$ is the same as the event that the region $\mathcal{S}_i(\alpha_\eta)$ is non-empty and the
node to wake up first in this region makes a progress of more than $z$, the probability of which is $\frac{p_z}{p_{\alpha_\eta}}$. Therefore for $z \in (\alpha_\eta, 1]$,

$$\mathbb{P}_K(Z^{\pi_{SF}(\eta)} > z) = \left(1 - (1 - p_{\alpha_\eta})^K\right) \frac{p_z}{p_{\alpha_\eta}} \tag{28}$$

B. Average Values for $\pi_{FF}$

The policy $\pi_{FF}$ (First Forward) transmits to the node in the region $\mathcal{S}_i$ which wakes up first, irrespective of the progress
made by it. Therefore,

$$\mathbb{E}_K[D^{\pi_{FF}}] = \frac{1}{K+1} \tag{29}$$

Average progress is,

$$\mathbb{E}_K[Z^{\pi_{FF}}] = \int_0^1 \mathbb{P}_K(Z^{\pi_{FF}} > z)\,dz = \int_0^1 p_z\,dz \tag{30}$$

C. Average Values for $\pi_{MF}$

The policy $\pi_{MF}$ (Max Forward) always waits for all the nodes to wake up and then transmits to the node which makes the
maximum progress. Therefore,

$$\mathbb{E}_K[D^{\pi_{MF}}] = \frac{K}{K+1} \tag{31}$$

Average progress is given by,

$$\mathbb{E}_K[Z^{\pi_{MF}}] = \int_0^1 \left(1 - (1 - p_z)^K\right) dz \tag{32}$$
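
The progress expressions (26) to (28), (30) and (32) all depend on $p_z$, the ratio of two lens-shaped areas. The Python sketch below evaluates them numerically. It is a sketch under stated assumptions: $|\mathcal{S}_i(z)|$ is computed with the standard formula for the intersection area of two discs (radius $r_c$ around node $i$ and radius $L_i - z$ around the sink, with centres $L_i$ apart), boundary effects of the deployment region are ignored, the integrals use a simple midpoint rule, and all function names are hypothetical.

```python
import math

def lens_area(d, r, R):
    """Area of intersection of two discs of radii r and R, centres d apart."""
    if d >= r + R:
        return 0.0
    if d <= abs(R - r):
        return math.pi * min(r, R) ** 2
    c1 = max(-1.0, min(1.0, (d * d + r * r - R * R) / (2 * d * r)))
    c2 = max(-1.0, min(1.0, (d * d + R * R - r * r) / (2 * d * R)))
    s = max(0.0, (-d + r + R) * (d + r - R) * (d - r + R) * (d + r + R))
    return r * r * math.acos(c1) + R * R * math.acos(c2) - 0.5 * math.sqrt(s)

def p(z, L_i, r_c=1.0):
    """p_z = |S_i(z)| / |S_i|."""
    return lens_area(L_i, r_c, L_i - z) / lens_area(L_i, r_c, L_i)

def integrate(f, a, b, n=2000):
    """Midpoint rule on [a, b]."""
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

def progress_ff(L_i):
    return integrate(lambda z: p(z, L_i), 0.0, 1.0)            # Eq. (30)

def progress_mf(K, L_i):
    return integrate(lambda z: 1 - (1 - p(z, L_i)) ** K, 0.0, 1.0)  # Eq. (32)

def progress_sf(alpha, K, L_i):
    # Eq. (27) on [0, alpha], Eq. (28) on (alpha, 1].
    part1 = integrate(lambda z: 1 - (1 - p(z, L_i)) ** K, 0.0, alpha)
    pa = p(alpha, L_i)
    if pa == 0.0:
        return part1
    part2 = (1 - (1 - pa) ** K) * integrate(lambda z: p(z, L_i) / pa, alpha, 1.0)
    return part1 + part2

print(progress_ff(10.0), progress_sf(0.5, 5, 10.0), progress_mf(5, 10.0))
```

With threshold 0 the sketch reduces to the First Forward value and with threshold 1 to the Max Forward value, which mirrors the way $\pi_{FF}$ and $\pi_{MF}$ bracket the threshold policy.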
VIII. Simulation Results 

A. One Hop Performance 

We apply the policies $\pi_{BF}(\eta)$ and $\pi_{SF}(\eta)$ to the actual model and obtain the average progress and average one hop delay for
$L_i = 10$ and $K = 5$. Expressions for the average values for policies $\pi_{SF}(\eta)$, $\pi_{FF}$ and $\pi_{MF}$ were obtained in Section VII.
Since it is difficult to obtain similar analytical expressions for policy $\pi_{BF}(\eta)$, we have performed simulations to obtain these
values. In Figs. 5(a) and 5(b) we plot the average values as a function of $\eta$. The minimum and maximum values of average
delay and progress are achieved by $\pi_{FF}$ and $\pi_{MF}$ respectively. From the figures we can observe that for values of $\eta$ less
than $\frac{1}{K\,\mathbb{E}_K[Z]}$ the performance of $\pi_{SF}(\eta)$ is the same as that of $\pi_{FF}$. This is because for $\eta$ below this value we have $\beta_1(0) < 0$, and
therefore the threshold used is $\alpha_\eta = 0$, which is the same as that used by $\pi_{FF}$.

By using a large value of $\eta$, a node will value progress more and will end up waiting for better nodes to wake up, thus
incurring a large delay as well. Hence, delay and progress for both the policies ($\pi_{BF}$ and $\pi_{SF}$) are increasing with $\eta$. We can
conclude from Lemma 1 that for each policy, BF or SF, and a given $\eta$, the corresponding delay value is the minimum that
can be obtained using that policy, subject to a constraint on progress equal to the progress value obtained for that $\eta$. These
corresponding average delay vs. average progress values are shown in Fig. 5(c) for $K = 3$, 5 and 15. Each point on the curve
for each $K$ corresponds to a different value of $\eta$, which increases along the curves as shown. We see that the performance of
the SF policy is close to that of the optimal BF policy, even for small values of $K$. The way $\eta$ serves to trade off one hop
progress and delay is clearly shown by these curves.



B. End to End Performance 

Although our policies have been developed for one-hop optimality, it is interesting to study their end-to-end performance
if they were used, heuristically, at each hop. We compare the end-to-end performance of our policy with the work of Kim et
al. [1], who have developed end-to-end delay optimal geographical forwarding in a setting similar to ours. We first give a brief
description of their work. They minimize, for a given network, the average delay from any node to the sink when each node $i$
wakes up asynchronously at a given rate. They show that periodic wake up patterns obtain minimum delay among all sleep-wake
patterns with the same rate. A relay node with a packet to forward transmits a sequence of beacon-ID signals. They propose
an algorithm called LOCAL-OPT [17] which yields, for each neighbor $j$ of node $i$, an integer $h_j^{(i)}$ such that if $j$ wakes up
and listens to the $h$-th beacon signal from node $i$ and if $h \le h_j^{(i)}$, then $j$ will send an ACK to receive the packet from $i$.
Otherwise (if $h > h_j^{(i)}$) $j$ will go back to sleep. A configuration phase is required to run the LOCAL-OPT algorithm.

Fig. 5. One Hop Performance: (a) Average one hop progress as a function of $\eta$ for various policies. The plots are for $L_i = 10$ and $K = 5$. Maximum
and minimum progress are achieved by $\pi_{MF}$ and $\pi_{FF}$ respectively. (b) Average one hop delay as a function of $\eta$ for various policies. The plots are for
$L_i = 10$ and $K = 5$. Maximum and minimum delay are achieved by $\pi_{MF}$ and $\pi_{FF}$. (c) Average one hop delay vs. the corresponding average one hop
progress for the class of policies $\pi_{BF}$ and $\pi_{SF}$, plotted for $K = 3$, 5 and 15. The parameter $\eta$ controls the delay-progress trade-off. Each point on the
curve corresponds to a different value of $\eta$, which increases along the direction shown.

As before, we fix $r_c = 1$ and $T = 1$ sec. Each node wakes up periodically with rate $\frac{1}{T}$ but asynchronously. To make a fair
comparison with the work of Kim et al. we introduce beacon-ID signals of duration $t_I = 5$ msec and a packet transmission
duration of $t_D = 30$ msec. We fix a network by placing $N$ nodes randomly in $[0, L]^2$ where $L = 10$. $N$ is sampled from
Poisson($\lambda L^2$) where $\lambda = 5$. Additional source and sink nodes are placed at locations $(0, 0)$ and $(L, L)$ respectively. Further,
we have considered a network where the forwarding set of each node is non-empty. The wake times of the nodes are sampled
independently from Uniform([0, 1]). A description of the policies that we have implemented is given below.
$\pi_{SF}$: We fix $\gamma$ as a network parameter. Each relay node chooses an appropriate $\eta$ (in other words, an appropriate
threshold $\alpha_\eta$) such that the average one hop progress made using the policy $\pi_{SF}(\eta)$ is equal to $\gamma$. Note that $\eta$ depends on
node $i$ (i.e., on the values of $L_i$ and $K$). At a relay node $i$, if $\gamma$ is less (greater) than the average progress made by $\pi_{FF}$
($\pi_{MF}$) then we allow node $i$ to use $\pi_{FF}$ ($\pi_{MF}$) to forward. When a node $j$ wakes up and hears a beacon signal from
$i$, it waits for the ID signal and then sends an ACK signal containing its location information. If the progress made by $j$ is
more than the threshold, then $i$ forwards the packet to $j$ (packet duration is $t_D = 30$ msec). If the progress made by $j$ is less
than the threshold, then $i$ asks $j$ to stay awake if its progress is the maximum among all the nodes that have woken up thus
far; otherwise $i$ asks $j$ to return to sleep. If more than one node wakes up during the same beacon signal, then contentions are
resolved by selecting the one which makes the most progress among them. In the simulation, this happens instantly (as also
for the Kim et al. algorithm that we compare with); in practice this will require a splitting algorithm; see, for example, [18,
Chapter 4.3]. We assume that within $t_I = 5$ msec all these transactions (beacon signal, ID, ACK and contention resolution if
any) are over. $\pi_{FF}$ and $\pi_{MF}$ can be thought of as special cases of $\pi_{SF}$ with thresholds of 0 and 1 respectively.
$\pi_{\widehat{SF}}$: This is the same as $\pi_{SF}$ except that here a relay node does not know $K$, but estimates its value as $\lfloor \lambda |\mathcal{S}_i| \rfloor$ nodes, where $|\mathcal{S}_i|$
is the area of the region $\mathcal{S}_i$ (see the equation defining $\mathcal{S}_i$). If there is no eligible node even after the $\frac{T}{t_I}$-th beacon signal (one case when
this is possible is when the actual number of nodes $K$ is less than $\lfloor \lambda |\mathcal{S}_i| \rfloor$ and none of the nodes makes a progress of more
than the threshold) then $i$ will select the one which makes the maximum progress among all nodes.

Kim et al.: We run the LOCAL-OPT algorithm [17] on the network and obtain the values $h_j^{(i)}$ for each pair $(i, j)$ where $i$ and
$j$ are neighbors. We use these values to route from source to sink in the presence of sleep-wake cycling. Contentions, if any,
are resolved (instantly, in the simulation) by selecting a node $j$ with the highest $h_j^{(i)}$ index.
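
The one hop behavior of the threshold policy described above can be illustrated with a short Python sketch. This is a simplified illustration and not the simulation code used for the results: beacon, ID, ACK and packet durations are ignored, contention is not modeled, and the function name `sf_one_hop` and its input format (a list of `(wake_time, progress)` pairs for the nodes in the forwarding set) are assumptions made here.

```python
def sf_one_hop(neighbors, alpha):
    """Return (delay, progress) for one hop under a threshold policy.

    neighbors: list of (wake_time, progress) pairs, one per forwarding node.
    alpha: progress threshold (0 behaves like First Forward, 1 like Max Forward).
    """
    best_progress = None
    last_wake = None
    for wake_time, progress in sorted(neighbors):
        last_wake = wake_time
        # The relay keeps the best node seen so far awake.
        if best_progress is None or progress > best_progress:
            best_progress = progress
        # Forward immediately to the first node that meets the threshold.
        if progress >= alpha:
            return wake_time, progress
    # No node met the threshold: forward to the best node once all have woken up.
    return last_wake, best_progress

# Hypothetical forwarding set of three nodes.
nodes = [(0.7, 0.9), (0.2, 0.3), (0.5, 0.6)]
for a in (0.0, 0.5, 0.95, 1.0):
    print(a, sf_one_hop(nodes, a))
```

Averaging the returned delay and progress over many independent draws of the wake times would give Monte Carlo estimates that can be compared against the analytical expressions of Section VII.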

In Fig. 6 we plot average total delay vs. average hop count for different policies for a fixed node placement, while the averaging
is over the wake times of the nodes. Each point on the curve is obtained by averaging over 1000 transfers of the packet from
the source node to the sink. As expected, Kim et al. achieves minimum average delay. In comparison with $\pi_{FF}$, Kim et al.
also achieves a smaller average hop count. Notice, however, that by using the $\pi_{SF}$ policy and properly choosing $\gamma$, it is possible to
obtain a hop count similar to that of Kim et al., incurring only slightly higher delay.

The advantage of $\pi_{SF}$ over Kim et al. is that there is no need for a configuration phase. Each relay node has to only
compute a threshold that depends on the parameter $\gamma$, which can be set as a network parameter during deployment. A more
interesting approach would be to allow the source node to set $\gamma$ depending on the type of application. For delay sensitive
applications it is appropriate to use a smaller value of $\gamma$ so that the delay is small, whereas for energy constrained applications
(where the network energy needs to be conserved) it is better to use a large $\gamma$ so that the number of hops (and hence the number
of transmissions) is reduced. For other applications, moderate values of $\gamma$ can be used. $\gamma$ can be a part of the ID signal so
that it is made available to the next hop relay.

Fig. 6. End-to-end performance: Plot of average end-to-end delay vs. average end-to-end hop count when the one hop optimal policy for the progress
constraint $\gamma$ is used at each hop. The operating points of the policies $\pi_{FF}$, $\pi_{MF}$ and Kim et al. are also shown in the figure. Each point on the curve
corresponds to a different value of $\gamma$, which increases along the direction shown.

Another interesting observation from Fig. 6 is that the performance of $\pi_{\widehat{SF}}$ (which uses an estimate of $K$) is close to that of $\pi_{SF}$. In practice it might not
be reasonable to expect a node to know the exact number of relays in the forwarding set. $\pi_{\widehat{SF}}$ works with the average number of
nodes instead of the actual number. For small values of $\gamma$ both the policies, $\pi_{SF}$ and $\pi_{\widehat{SF}}$, most of the time transmit to the
first node to wake up. Hence the performance is similar for small $\gamma$. For larger $\gamma$, we observe that the delay incurred by $\pi_{\widehat{SF}}$
is larger.

IX. Summary and Future Work 

The problem of optimal relay selection for geographical forwarding was formulated as one of minimizing the forwarding
delay subject to a constraint on progress. The simple policy (SF) of transmitting to the first node that wakes up and makes a
progress of more than a threshold was found to be close in performance to the optimal policy. We then compared the end-to-end
performance (average delay and average hop count) of using SF at each relay node en route to the sink with that of the policy
proposed by Kim et al. [1], which is designed to achieve minimum average end-to-end delay. However, the delay obtained by
the policy in [1] is only a little smaller than that obtained by the FF policy. Further, by using the SF policy with an appropriate
$\gamma$, performance very close to that of the policy in [1] can be obtained without the need for an initial global configuration phase.
We note that $\pi_{SF}$ is self-configuring; each node takes decisions based only on local information. The end-to-end performance
obtained can be tuned by the use of a single parameter $\gamma$. For a small $\gamma$ we obtain low end-to-end delay but the number of
hops is large, and vice versa.

In this work we have assumed that each node knows the number of neighbors in its forwarding set. We have also given a heuristic
policy, $\pi_{\widehat{SF}}$, for the case when the actual number of forwarding neighbors is not known. In future work we aim to obtain optimal forwarding
policies by relaxing this assumption. Also, the use of a one-hop optimal policy for end-to-end forwarding is a heuristic. In
future work we propose to directly formulate the end-to-end problem and derive optimal policies. In addition, we could also
include aspects such as the relay's link quality in our formulation.
Appendix
Proof of Lemma 4

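Proof: [Proof of 4.1] Recall from Equation (20) that

$$\beta_1(b) = \mathbb{E}_K[\max\{b, Z\}] - \frac{1}{\eta K}$$

Let $F_Z$ represent the c.d.f. of $Z$. For $b \in [0, 1]$, the c.d.f. of $\max\{b, Z\}$ is,

$$F_{\max\{b, Z\}}(z) = \begin{cases} 0, & \text{if } z < b \\ F_Z(z), & \text{if } z \ge b \end{cases}$$

Therefore,

$$\begin{aligned}
\beta_1(b) &= \int_0^1 \left(1 - F_{\max\{b, Z\}}(z)\right) dz - \frac{1}{\eta K} \\
&= b + \int_b^1 \left(1 - F_Z(z)\right) dz - \frac{1}{\eta K}
\end{aligned}$$

$\beta_1'(b) = F_Z(b) \ge 0$ and $\beta_1''(b) = f_Z(b) \ge 0$ imply that $\beta_1$ is continuous, increasing and convex in $b$. ■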

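Proof: [Proof of 4.2] Since $\mathbb{E}_K[\max\{b, Z\}] \le 1$, $\eta > 0$ and $K > 0$, we have $\beta_1(1) < 1$. Also $\beta_1$ is convex (from Lemma
4.1). Hence we can write,

$$\begin{aligned}
\beta_1(b) &\le (1 - b)\beta_1(0) + b\,\beta_1(1) \\
&< b
\end{aligned}$$

■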

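Proof: [Proof of 4.3] Let $g(b) = b - \beta_1(b)$. Then, $g(0) \le 0$ and $g(1) > 0$ (because $\beta_1(1) < 1$). Also $g(b)$ is continuous
(being differentiable) on $[0, 1]$. Hence, there exists an $\alpha_\eta \in [0, 1)$ such that $g(\alpha_\eta) = 0$.

Suppose there exists an $\alpha'_\eta > \alpha_\eta$ such that $g(\alpha'_\eta) = 0$. Then by convexity of $\beta_1$ (from Lemma 4.1),

$$\beta_1(\alpha'_\eta) \le \frac{1 - \alpha'_\eta}{1 - \alpha_\eta}\,\beta_1(\alpha_\eta) + \frac{\alpha'_\eta - \alpha_\eta}{1 - \alpha_\eta}\,\beta_1(1)$$

i.e., $\beta_1(1) \ge 1$. This contradicts the fact that $\beta_1(1) < 1$. ■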

Proof: [Proof of 4.4] Again consider $g(b) = b - \beta_1(b)$. $g(b)$ is continuous (being differentiable) on $[0, 1]$. Suppose there exists a
$b \in (\alpha_\eta, 1]$ such that $\beta_1(b) \ge b$; then $g(b) \le 0$ and $g(1) > 0$. This implies that there exists a $b' \in [b, 1)$ such that $g(b') = 0$, which contradicts
the uniqueness of $\alpha_\eta$ shown in Lemma 4.3. Similarly it can be shown that $\beta_1(b) > b$ for $b \in [0, \alpha_\eta)$. ■



